LOGOS: Enabling Local Resource Managers for the Efficient Support of Data-Intensive Workflows within Grid Sites
Authors
D. A. Monge, C. García Garino
Abstract
In this study we discuss how to enable grid sites to support data-intensive workflows. Within grid sites, tasks and resources are usually administered by local resource managers (LRMs). Many LRMs have been designed for managing compute-intensive applications. Data-intensive workflow applications might therefore perform poorly in such environments because of the number and size of the data transfers between tasks. To improve the performance of such applications, it is necessary to redefine the scheduling policies integrated into LRMs. This paper proposes a novel scheme for efficiently supporting data-intensive workflows in LRMs within grid sites. This scheme is partially implemented in our grid middleware LOGOS and used to improve the performance of a well-known LRM: HTCondor. The core of LOGOS is a novel communication-aware scheduling algorithm (PPSA) capable of finding near-optimal solutions. Experiments conducted in this study showed that our approach leads to performance improvements of up to 52% in the management of data-intensive workflow applications.
Similar resources
A Throughput Maximisation Strategy for Scheduling Transaction Intensive Workflows on SwinDeW-G
With the rapid development of e-business, workflow systems now have to deal with transaction intensive workflows whose main characteristic is the huge number of concurrent workflow instances. For such workflows, it is important to maximise the overall throughput to provide good quality of service. However, most existing scheduling algorithms are designed for scheduling of a single complex scien...
A New Job Scheduling in Data Grid Environment Based on Data and Computational Resource Availability
Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. The heterogeneity and geographic dispersion of grid resources and applications pose complex problems such as job scheduling. Most existing scheduling algorithms in Grids only focus on one kind of Grid jobs which can be data...
Dynamic Workflows for Grid Applications
In the Grid computing community, there are several approaches to execute not only single tasks on single Grid resources but also to support workflow schemes that enable the composition and execution of complex Grid applications. The most commonly used workflow model for this purpose is the Directed Acyclic Graph (DAG). Within the establishment of the Fraunhofer Resource Grid, we developed a Gri...
Storage Resource Managers: Middleware Components for Grid Storage
The amount of scientific data generated by simulations or collected from large scale experiments has reached levels that cannot be stored in the researcher’s workstation or even in his/her local computer center. Such data are vital to large scientific collaborations dispersed over wide-area networks. In the past, the concept of a Grid infrastructure [1] mainly emphasized the computational aspe...
Utilizing Heterogeneous Data Sources in Computational Grid Workflows
Besides computation intensive tasks, the Grid also facilitates sharing and processing very large databases and file systems that are distributed over multiple resources and administrative domains. Although accessing data in the Grid is supported by various lower level tools, end-users find it difficult to utilise these solutions directly. High level environments, such as Grid portal and workflo...
Journal:
- Computing and Informatics
Volume 33, Issue
Pages -
Publication date: 2014